SIGNAL PROCESSING APPARATUS AND METHOD



Background and Field of the Invention 
This invention relates to a method of signal processing and apparatus therefor.

In many situations, observations are made of the output of a multiple input and multiple
output system such as a phased array radar system, sonar array system or microphone array
system, from which it is desired to recover the wanted signal alone with all the
unwanted signals, including noise, cancelled or suppressed. For example, in a
microphone array system for a speech recognition application, the objective is to
enhance the target speech signal in the presence of background noise and competing
speakers.

The most widely used approach to noise or interference cancellation in a multiple
channel case was suggested by Widrow et al. in "Adaptive Antenna Systems", Proc. IEEE,
Vol. 55, No. 12, Dec. 1967 and "Signal Cancellation Phenomena in Antennas: causes and
cures", IEEE Trans. Antennas Propag., Vol. AP-30, May 1982, and also by L. J. Griffiths et al.
in "An Alternative Approach to Linearly Constrained Adaptive Beamforming", IEEE
Trans. Antennas Propag., Vol. AP-30, 1982. In these and other similar approaches, the
signal processing apparatus separates the observed signal into a primary channel which
comprises both the target signal and the interference signal and noise, and a secondary
channel which comprises interference signal and noise alone. The interference signals
and noise in the primary channel are estimated using an adaptive filter having the
secondary channel signal as input, the estimated interference and noise signal being
subtracted from the primary channel to obtain the desired target signal.

There are two major drawbacks of the above approaches. The first is that it is assumed
that the secondary channel comprises interference signals and noise only. This
assumption may not be correct in practice due to leakage of wanted signals into the
secondary channel due to hardware imperfections and limited array dimension. The
second is that it is assumed that the interference signals and noise can be estimated
accurately from the secondary channel. This assumption may also not be correct in
practice because this will require a large number of degrees of freedom, implying
a very long filter and large array dimension. A very long filter leads to other problems
such as a slow rate of convergence and instability.

The first drawback will lead to signal cancellation. This degrades the performance of the
apparatus. Depending on the input signal power, this degradation may be severe, leading
to poor quality of the reconstructed speech because a portion of the desired signal is also
cancelled by the filtering process. The second drawback will lead to poor interference
and noise cancellation, especially of low frequency interference signals the wavelengths of
which are many times the dimension of the array.

It is an object of the invention to provide an improved signal processing apparatus and 
method. 


Summary of the Invention 



According to the invention in a first aspect, there is provided a method of processing
signals received from an array of sensors comprising the steps of sampling and digitally
converting the received signals and processing the digitally converted signals to provide
an output signal, the processing including filtering the signals using a first adaptive filter
arranged to enhance a target signal of the digitally converted signals and a second
adaptive filter arranged to suppress an unwanted signal of the digitally converted signals
and processing the filtered signals in the frequency domain to suppress the unwanted
signal further.

Further preferred features of the invention are recited in appendant claims 2-40. 

According to the invention in a second aspect, there is provided a method of calculating
a spectrum from a coupled signal comprising the steps of:

1) deriving a target signal component S and an interference signal component I from
the coupled signal;

2) transforming the target and interference signal components into respective
frequency domain equivalents F(S) and F(I); and

3) constructing the spectrum P(S) and P(I) of at least one equivalent in accordance
with:

P(S) = |Real(F(S))| + |Imag(F(S))| + G[F(S)]*R(s)

P(I) = |Real(F(I))| + |Imag(F(I))| + G[F(I)]*R(i)

where |Real( )| and |Imag( )| refer to taking the absolute value of the real or imaginary part of
the frequency domain equivalent, R(s) and R(i) are scalar adjustment factors, and G[F(S)] and
G[F(I)] are functions of F(S) and F(I) respectively.
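
The spectrum construction of the second aspect can be illustrated with a minimal sketch. The text does not define the function G, so the example below assumes, purely for illustration, that G is the magnitude of the frequency domain equivalent; the names `pseudo_spectrum`, `r` and `g` are hypothetical.

```python
import numpy as np

def pseudo_spectrum(component, r, g=np.abs):
    """Return |Real(F)| + |Imag(F)| + G[F]*R for one time-domain block.

    g stands in for the unspecified function G (assumed here to be the
    magnitude); r is the scalar adjustment factor R(s) or R(i).
    """
    f = np.fft.rfft(component)  # frequency domain equivalent F(.)
    return np.abs(f.real) + np.abs(f.imag) + g(f) * r

# Example block whose energy sits in a single frequency bin.
s = np.array([1.0, 0.0, -1.0, 0.0])
p_s = pseudo_spectrum(s, r=0.5)
```

Summing the absolute real and imaginary parts avoids the squaring and square root of a conventional magnitude or power spectrum, which is presumably why the result is later called a "pseudo" spectrum.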




According to the invention in a third aspect, there is provided a method of calculating
a reverberation coefficient from a plurality of signals received from respective sensors
in respective signal channels of a sensor array comprising the steps of:

1) calculating a correlation time delay between signals from a reference one of the
channels and another one of the channels using a first adaptive filter;

2) performing adaptive filtering, using a second adaptive filter, on the received
signals; and

3) calculating a reverberation coefficient from the filter coefficients of the first and
second filters.

According to the invention in a fourth aspect, there is provided a method of signal 
processing of a signal having wanted and unwanted components comprising the steps 
of: 

1) processing the signal in the time domain with at least one adaptive filter to
enhance the wanted signal and/or reduce the unwanted signal, 

2) transforming the thus processed signal to the frequency domain; and 

3) performing at least one unwanted signal reduction process in the frequency 
domain. 

The invention extends to apparatus for performing the method of the aforementioned
aspects.

Each aspect of the invention is usable independently of the others, for example in other
signal processing apparatus which need not include other features of this invention as
described.

The described embodiment of the invention discloses a method and apparatus to enhance
an observed target signal from a predetermined or known direction of arrival. The
apparatus cancels and suppresses the unwanted signals and noise from their coupled
observation by the apparatus. An approach is disclosed to enhance the target signal in
a more realistic scenario where both the target signal and interference signal and noise
are coupled in the observed signals. Further, no assumption is made regarding the
number or the direction of arrival of the interference signals.

The described embodiment includes an array of sensors, e.g. microphones, each defining
a corresponding signal channel, an array of receivers with preamplifiers, an array of
analog to digital converters for digitally converting observed signals and a digital signal
processor that processes the signals. From the observed signals, the apparatus outputs
an enhanced target signal and reduces the noise and interference signals. The apparatus
allows a tradeoff between interference and noise suppression level and signal quality.
No assumptions are made about the number of interference signals and the characteristics
of the noise.

The digital signal processor includes a first set of adaptive filters which act as a signal
spatial filter using a first channel as a reference channel. This filter removes the target
signal "s" from the coupled signal and puts the remaining elements of the coupled
signal, namely interference signals "u" and system noise "q", in an interference plus
noise channel referred to as a Difference Channel. This filter also enhances the target
signal "s" and puts this in another channel, referred to as the Sum Channel. The Sum
Channel consists of the enhanced target signal "s" and the interference signals "u" and
noise "q".



The target signal "s" may not be removed completely from the Difference Channel due
to the sudden movement of the target speaker or of an object within the vicinity of the
speaker, so this channel may contain some residue target signal on occasions which can
lead to some signal cancellation. However, the described embodiment greatly reduces
this.

The signals from the Difference Channel are fed to a second adaptive filter set. This
set of filters adaptively estimates the interference signals and noise in the Sum Channel.

The estimated signals are fed to an Interference Signal and Noise Cancellation and
Suppression Processor which cancels and suppresses the noise and interference signals
from the Sum Channel and outputs the enhanced target signal.

Updating of the parameters of the sets of adaptive filters is performed using a further
processor termed a Preliminary Signal Parameters Estimator which receives the observed
signal and estimates the reverberation level of the signal, the system noise level, the
signal level, the signal detection thresholds and the angle of arrival of the signal.
This information is used by the decision processor to decide if any parameter update is
required.

One application of the described embodiment of the invention is speech enhancement
in a car environment where the direction of the target signal with respect to the system
is known. Yet another application is speech input for speech recognition applications.
Again the direction of arrival of the signal is known.

Brief Description of the Drawings

An embodiment of the invention will now be described by way of example with 
reference to the accompanying drawings in which: 

Fig. 1 illustrates a general scenario where the invention may be used.

Fig. 2 is a schematic illustration of a general digital signal processing system embodying
the present invention.

Fig. 3 is a system level block diagram of the described embodiment of Fig. 2.

Fig. 4a-c is a flow chart illustrating the operation of the embodiment of Fig. 3.

Fig. 5 illustrates a typical plot of nonlinear energy of a channel and the established
thresholds.

Fig. 6(a) illustrates a wavefront arriving from a direction 40 degrees off boresight.

Fig. 6(b) represents a time delay estimator using an adaptive filter.

Fig. 6(c) shows the impulse response of the filter, indicating a wavefront from the
boresight direction.

Fig. 7 illustrates the reverberation level of the received signal over time.

Fig. 8 shows the schematic block diagram of the four channel Adaptive Spatial Filter.

Fig. 9 shows the schematic block diagram of the Adaptive Interference and Noise
Estimator of Fig. 3.

Fig. 10 shows an input signal buffer.

Fig. 11 shows the use of a Hanning Window on overlapping blocks of signals.

Fig. 12 illustrates a sudden rise of noise level of the nonlinear energy plot.

Fig. 13 illustrates the readjustment of the thresholds to reflect the sudden rise of noise
energy level.

Detailed Description of the Embodiment of the Invention

FIG. 1 illustrates schematically the operating environment of a signal processing
apparatus 5 of the described embodiment of the invention, shown in a simplified
example of a room. A target sound signal "s" emitted from a source s' in a known
direction impinging on a sensor array, such as a microphone array 10 of the apparatus
5, is coupled with other unwanted signals, namely interference signals u1, u2 from other
sources A, B, reflections of these signals u1r, u2r and the target signal's own reflected
signal sr. These unwanted signals cause interference and degrade the quality of the
target signal "s" as received by the sensor array. The actual number of unwanted signals
depends on the number of sources and room geometry but only three reflected (echo)
paths and three direct paths are illustrated for simplicity of explanation. The sensor
array 10 is connected to processing circuitry 20-60 and there will be a noise input q
associated with the circuitry which further degrades the target signal.

An embodiment of signal processing apparatus 5 is shown in FIG. 2. The apparatus
observes the environment with an array of four sensors such as microphones 10a-10d.
Target and noise/interference sound signals are coupled when impinging on each of the
sensors. The signal received by each of the sensors is amplified by an amplifier 20a-d
and converted to a digital bitstream using an analogue to digital converter 30a-d. The
bit streams are fed in parallel to the digital signal processor 40 to be processed
digitally. The processor provides an output signal to a digital to analogue converter 50
which is fed to a line amplifier 60 to provide the final analogue output.

FIG. 3 shows the major functional blocks of the digital processor in more detail. The
multiple input coupled signals are received by the four-channel microphone array 10a-10d,
each of which forms a signal channel, with channel 10a being the reference
channel. The received signals are passed to a receiver front end which provides the
functions of amplifiers 20 and analogue to digital converters 30 in a single custom chip.
The four channel digitized output signals are fed in parallel to the digital signal
processor 40. The digital signal processor 40 comprises four sub-processors. They are
(a) a Preliminary Signal Parameters Estimator and Decision Processor 42, (b) a Signal
Adaptive Spatial Filter 44, (c) an Adaptive Linear Interference and Noise Estimator 46,
and (d) an Adaptive Interference and Noise Cancellation and Suppression Processor 48.
The basic signal flow is from processor 42, to processor 44, to processor 46, to
processor 48, these connections being represented by thick arrows in Fig. 3. The
filtered signal S is output from processor 48. Decisions necessary for the operation of
the processor 40 are generally made by processor 42 which receives information from
processors 44 - 48, makes decisions on the basis of that information and sends
instructions to processors 44 - 48, through connections represented by thin arrows in Fig. 3.

It will be appreciated that the splitting of the processor 40 into the four component parts
42, 44, 46, 48 is essentially notional and is made to assist understanding of the operation
of the processor. The processor 40 would in reality be embodied as a single multi-function
digital processor performing the functions described under control of a program
with suitable memory and other peripherals.

A flowchart illustrating the operation of the processors is shown in Figs 4a-c and this
will firstly be described generally. A more detailed explanation of aspects of the
processor operation will then follow.

The front end 20,30 processes samples of the signals received from array 10 at a
predetermined sampling frequency, for example 16kHz. The processor 42 includes an
input buffer 43 that can hold N such samples for each of the four channels. Upon
initialization, the apparatus collects a block of N/2 new signal samples for all the
channels at step 500, so that the buffer holds a block of N/2 new samples and a block
of N/2 previous samples. The processor 42 then removes any DC from the new samples
and preemphasizes or whitens the samples at step 502.

There then follows a short initialization period at step 504 in which the first 20 blocks
of N/2 samples of signal after start-up are used to estimate the environment noise energy
E_n, and two detection thresholds, a noise threshold T_n1 and a larger signal threshold T_n2,
are calculated by processor 42 from E_n using scaling factors. During this short period,
an assumption is made that no target signals are present. These signals do, however,
continue to be processed, so that an initial Bark Scale system noise value may be
derived at step 570, below.

After this initialisation period, the energies and thresholds update automatically as 
described below. The samples from the reference channel 10a are used for this purpose 
although any other channel could be used. 

The total non-linear energy of the signal samples E_r is then calculated at step 506.

At step 508, it is determined if the signal energy E_r is greater than the first threshold
T_n1. If not, the environment noise E_n and the two thresholds are updated at step 510
using the new value of E_r calculated in step 506. The Bark Scale system noise B_n (see
below) is also similarly updated via point F. The routine then moves to point B. If so,
the signal is passed to a threshold adjusting sub-routine 512-518.

Steps 512-518 are used to compensate for abrupt changes in environment noise level
which may capture the thresholds. A time counter is used to determine if the signal level
shows a steady state increase, which would indicate an increase in noise, since the
speech target signal will show considerable variation over time and thus can be
distinguished. This is illustrated in Fig. 12 in which a signal noise level rises from an
initial level to a new level which exceeds both thresholds. At step 512 a time counter
C_c is incremented. At step 514 C_c is checked against a threshold T_cc. If the threshold
is not reached, the program moves to step 520 described below. If the threshold is
reached, the estimated noise energy E_n is then increased at step 516 by a multiple a and
E_n, T_n1 and T_n2 are updated at step 518. The effect of this is illustrated in Fig. 13. The
counter is reset and updating ceases when the signal energy E_r is less than the
second threshold T_n2 as tested at step 520 below.

A test is made at step 520 to see if the estimated energy E_r in the reference channel 10a
exceeds the second threshold T_n2. If so, a candidate target signal is deemed to be present.
The apparatus only wishes to process candidate target signals that impinge on the array
10 from a known direction normal to the array, hereinafter referred to as the boresight
direction, or from a limited angular departure therefrom, in this embodiment plus or
minus 15 degrees. Therefore the next stage is to check for any signal arriving from this
direction.

At step 524 two coefficients are established, namely a correlation coefficient C_x and a
correlation time delay T_d, which together provide an indication of the direction from
which the target signal arrived.

At step 526, two tests are conducted to determine if the candidate target signal is an
actual target signal. First, the crosscorrelation coefficient C_x must exceed a
predetermined threshold T_c and, second, the size of the time delay coefficient must be
less than a value θ indicating that the signal has impinged on the array within the
predetermined angular range. If these conditions are not met, the signal is not regarded
as a target signal and the routine passes to point B. If the conditions are met, the
routine passes to point A.

If at step 520, the estimated energy E_r in the reference channel 10a is found not to
exceed the second threshold T_n2, the target signal is considered not to be present and the
routine passes to point B via step 522 in which the counter C_c is reset. This is done
since the second threshold at this point is above the level of the total signal energy E_r,
indicating that the threshold must be, consequently, above the environment noise energy
level E_n and thus updating of E_n is no longer necessary.

Thus, the signal has, by points A and B, been preliminarily classified into a target signal 
(point A) or a noise signal (point B). 


Following point A, the signal is subject to a further test at steps 528-532. At step 528,
it is determined if the filter coefficients W_su of filter 44 have yet been updated. If not,
the subsequent steps 530, 532 are skipped, since these rely on the coefficients of filter
44 for calculation purposes. If so, a reverberation coefficient C_rv, which provides a
measure of the degree of reverberation of the signal, is calculated at step 530 and at step 532 it is
determined if C_rv exceeds a threshold T_rv. If so, this indicates an acceptable level of
reverberation in the signal and the routine passes to step 534 (target signal filtering).
If not, the signal joins the path from point B to step 536 (non-target signal filtering).

The now confirmed target signal is fed to the Signal Adaptive Spatial Filter 44, the
purpose of which is to enhance the target signal. The filter is instructed to perform
adaptive filtering at steps 534 and 538, in which the filter coefficients W_su are adapted
to provide a "target signal plus noise" signal in the reference channel and "noise only"
signals in the remaining channels using the Least Mean Square (LMS) algorithm. The
filter 44 output channel equivalent to the reference channel is for convenience referred
to as the Sum Channel and the filter 44 output from the other channels, Difference
Channels. The signal so processed will be, for convenience, referred to as A'.

If the signal is considered to be a noise signal, the routine passes to step 536 in which
the signals are passed through filter 44 without the filter coefficients being adapted, to
form the Sum and Difference channel signals. The signals so processed will be referred
to for convenience as B'.

The effect of the filter 44 is to enhance the signal if this is identified as a target signal
but not otherwise.

At step 540, an energy ratio R_sd between the Sum Channel and the Difference Channels
is estimated by processor 42. At step 542 two tests are made. First, if the signals are
A' signals from step 534, the routine passes to step 550. Second, for those signals for
which E_r > T_n2 (i.e., high energy level), R_sd is compared to a threshold T_sd. If the ratio
is lower than T_sd, this indicates probable noise but if higher, this may indicate that there
has been some leakage of the target signal into the Difference channel, indicating the
presence of a target signal after all. For such target signals the routine also passes to
step 550. For all other non-target signals, the routine passes to step 544.

At steps 544-560, the signals are processed by the Adaptive Linear Interference and
Noise Estimation Filter 46, the purpose of which is to reduce the unwanted signals. The
filter 46, at step 544, is instructed to perform adaptive filtering on the non-target signals
with the intention of adapting the filter coefficients to reduce the unwanted signal in
the Sum channel to some small error value e_c.

To further prevent signal cancellation, the norm of the filter coefficients is calculated
by processor 42 at step 546. If this norm exceeds a predetermined value [T_no] at step
548, then the filter coefficients are scaled at step 549 to a reduced value.
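
The norm check of steps 546-549 can be sketched as follows. The text says only that the coefficients are scaled "to a reduced value", so rescaling them until their norm equals the limit is an assumption made for illustration; `constrain_norm` and `t_no` are hypothetical names.

```python
import numpy as np

def constrain_norm(weights, t_no):
    """Scale the filter coefficients down if their norm exceeds t_no.

    Assumption: the coefficients are rescaled so the norm equals t_no;
    the source does not state the exact scaling rule.
    """
    norm = np.linalg.norm(weights)
    if norm > t_no:
        return weights * (t_no / norm)
    return weights

w = constrain_norm(np.array([3.0, 4.0]), t_no=1.0)  # norm 5 is reduced to 1
```

Bounding the coefficient norm limits how strongly any residual target signal in the Difference Channels can be subtracted from the Sum Channel.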

In the alternative, at step 550, the target signals are fed to the filter 46 but this time, no
adaptive filtering takes place, so the Sum and Difference signals pass through the filter.
An output of the Sum Channel signal without alteration is also passed through the filter
46.

The output signals from processor 46 are thus the Sum channel signal S_c (point C),
filtered Difference signals D_c (point E) and the error signal e_c (point D). At step 562,
a weighted average S(t) of the error signal e_c and the Sum Channel signal is calculated
and the signals from the Difference channels D_c are summed to form a single signal I(t).

These signals S(t) and I(t) are then collected for the new N/2 samples and the last N/2
samples from the previous block and a Hanning Window H_n is applied to the collected
samples as shown in Fig. 10 to form vectors S_h and I_h. This is an overlapping technique
with overlapping vectors S_h, I_h being formed from past and present blocks of N/2 samples
continuously. This is illustrated in Fig. 11. A Fast Fourier Transform is then performed
on the vectors S_h and I_h to transform the vectors into frequency domain equivalents S_f
and I_f at step 564.

At step 566 a modified spectrum is calculated for the transformed signals to provide
"pseudo" spectrum values P_s and P_i and these values are warped into the same Bark
Frequency Scale to provide Bark Frequency scaled values B_s and B_i at step 568.

The Bark value B_n of the system noise of the Sum Channel is updated at step 570 using
B_s and the previous value of B_n, if the condition at step 508 is met (through path F).
At start-up, B_n is initially calculated at this block whether or not the condition is met.
At this time, there must be no target signal present, thus requiring a short initialization
period after signal detection has begun, for this initial B_n value to be established.

A weighted combination B_y of B_n and B_i is then made at step 572 and this is combined
with B_s to compute the Bark Scale nonlinear gain G_b at step 574.

G_b is then unwarped to the normal frequency domain to provide a gain value G at step
578 and this is then used at step 580 to compute an output spectrum S_out using the signal
spectrum S_f from step 564. This gain-adjusted spectrum suppresses the interference
signals, the environmental noise and the system noise.

An inverse FFT is then performed on the spectrum S out at step 582 and the output signal 
is then reconstructed from the overlapping signals using the overlap add procedure at 
step 584. 
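
The windowing, transform, gain and overlap-add chain of steps 562-584 can be sketched as below. This is a simplified illustration rather than the described processor: the gain is supplied as a fixed array instead of being derived on the Bark scale, a periodic Hanning window is assumed so that 50% overlapped windows sum to one, and the input length is assumed to be a multiple of N/2. The name `overlap_add_filter` is hypothetical.

```python
import numpy as np

def overlap_add_filter(signal, n, gain):
    """Apply a per-bin gain to 50% overlapped, Hanning-windowed blocks.

    signal: 1-D array whose length is a multiple of n // 2 (assumption).
    gain:   array of n // 2 + 1 gains, one per rfft bin.
    """
    half = n // 2
    window = np.hanning(n + 1)[:n]  # periodic Hanning window
    padded = np.concatenate([np.zeros(half), signal, np.zeros(half)])
    out = np.zeros(len(padded))
    for start in range(0, len(padded) - n + 1, half):
        block = padded[start:start + n] * window       # previous and new N/2 samples
        spectrum = np.fft.rfft(block)                  # step 564
        block_out = np.fft.irfft(spectrum * gain, n)   # steps 580 and 582
        out[start:start + n] += block_out              # overlap add, step 584
    return out[half:half + len(signal)]

x = np.arange(16.0)
y = overlap_add_filter(x, n=8, gain=np.ones(5))  # unity gain returns the input
```

With unity gain the overlapped windows add back to the original samples, which is a convenient check that the buffering and reconstruction are consistent before a real suppression gain is applied.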


Major steps in the above described flowchart will now be described in more detail.

NonLinear Energy and Threshold Estimation and Updating (STEPS 506, 510)

The processor 42 estimates the energy output from a reference channel. In the four 
channel example described, channel 10a is used as the reference channel. 

N/2 samples of the digitized signal are buffered into a shift register to form a signal
vector of the following form:

$$X_r = \begin{bmatrix} x(0) \\ x(1) \\ \vdots \\ x(J-1) \end{bmatrix} \qquad \ldots \text{A.1}$$

where J = N/2. The size of the vector depends on the resolution requirement. In the
preferred embodiment, J = 256 samples.

The nonlinear energy of the vector is then estimated using the following equation:

$$E_r = \frac{1}{J-2} \sum_{i=1}^{J-2} \left[ x(i)^2 - x(i+1)\,x(i-1) \right] \qquad \ldots \text{A.2}$$



When the system is initialized, the average system and environment noise energy is
estimated using the first 20 blocks of signal. A first order recursive filter is used to
carry out this task as shown below:

$$E_n^{K+1} = \alpha E_n^{K} + (1-\alpha) E_r^{K+1} \qquad \ldots \text{A.3}$$

where the superscript K is the block number and α is an empirically chosen weight
between zero and one. In this embodiment, α = 0.9.

Once the noise energy E_n is obtained, the two signal detection thresholds T_n1 and T_n2
are established as follows:

$$T_{n1} = \delta_1 E_n \qquad \ldots \text{A.4}$$

$$T_{n2} = \delta_2 E_n \qquad \ldots \text{A.5}$$

δ_1 and δ_2 are scalar values that are used to select the thresholds so as to optimize signal
detection and minimize false signal detection. As shown in Fig. 5, T_n1 should be above
the system noise level, with T_n2 sufficient to be generally breached by the potential
target signal. These thresholds may be found by trial and error. In this embodiment,
δ_1 = 1.125 and δ_2 = 1.8 have been found to give good results.

Once the thresholds have been established, E_n may be updated after initialization in step
510 as follows:

$$\text{If } E_r < T_{n1}: \quad E_n = \alpha E_n + (1-\alpha) E_r; \qquad \text{Else}: \quad E_n = E_n \qquad \ldots \text{A.6}$$

The updated thresholds may then be calculated according to equations A.4 and A.5.
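
The energy estimate and threshold update can be sketched as follows. The nonlinear energy is assumed here to take the Teager-Kaiser form x(i)^2 - x(i+1)x(i-1) averaged over the block, which is an interpretation of the term "nonlinear energy" rather than something the legible text confirms. The constants are the values quoted for this embodiment; the function names are hypothetical.

```python
import numpy as np

ALPHA = 0.9      # recursive filter weight
DELTA_1 = 1.125  # scaling factor for T_n1
DELTA_2 = 1.8    # scaling factor for T_n2

def nonlinear_energy(x):
    """Mean of x(i)^2 - x(i+1)*x(i-1) over the interior samples (assumed form)."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(x[1:-1] ** 2 - x[2:] * x[:-2]))

def thresholds(e_n):
    """Return (T_n1, T_n2) as scaled versions of the noise energy."""
    return DELTA_1 * e_n, DELTA_2 * e_n

def update_noise(e_n, e_r):
    """Update E_n only when the block energy is below T_n1 (steps 508, 510)."""
    t_n1, _ = thresholds(e_n)
    if e_r < t_n1:
        e_n = ALPHA * e_n + (1 - ALPHA) * e_r
    return e_n

e_r = nonlinear_energy([1.0, 2.0, 3.0, 4.0])
e_n = update_noise(1.0, e_r)
t_n1, t_n2 = thresholds(e_n)
```

Because both thresholds are fixed multiples of E_n, they track slow changes in the noise floor without any separate adaptation rule.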

Time Delay Estimation (T_d) (STEP 524)



FIG. 6A illustrates a single wave front impinging on the sensor array. The wave front
impinges on sensor 10d first (A as shown) and at a later time impinges on sensor 10a
(A' as shown), after a time delay t_d. This is because the signal originates at an angle of
40 degrees from the boresight direction. If the signal originated from the boresight
direction, the time delay t_d would ideally be zero.

Time delay estimation is performed using a tapped delay line time delay estimator
included in the processor 42 which is shown in Fig. 6B. The filter has a delay element
600, having a delay Z^(-L_0/2), connected to the reference channel 10a and a tapped delay line
filter 610 having a filter coefficient W_td connected to channel 10d. Delay element 600
provides a delay equal to half of that of the tapped delay line filter 610. The output
from the delay element is d(k) and that from filter 610 is d'(k). The difference of these
outputs is taken at element 620, providing an error signal e(k) (where k is a time index
used for ease of illustration). The error is fed back to the filter 610. The Least Mean
Squares (LMS) algorithm is used to adapt the filter coefficient W_td as follows:

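$$W_{td}(k+1) = W_{td}(k) + 2\mu_{td}\, S_{10d}(k)\, e(k) \qquad \ldots \text{B.1}$$

$$W_{td}(k+1) = \begin{bmatrix} W_{td}^{1}(k+1) \\ \vdots \\ W_{td}^{L_0}(k+1) \end{bmatrix} \qquad \ldots \text{B.2}$$

$$S_{10d}(k) = \begin{bmatrix} S_{10d}^{1}(k) \\ \vdots \\ S_{10d}^{L_0}(k) \end{bmatrix} \qquad \ldots \text{B.3}$$

$$e(k) = d(k) - d'(k) \qquad \ldots \text{B.4}$$

$$d'(k) = W_{td}(k)^{T} \cdot S_{10d}(k) \qquad \ldots \text{B.5}$$

$$\mu_{td} = \frac{\beta_{td}}{\| S_{10d}(k) \|} \qquad \ldots \text{B.6}$$

where β_td is a user selected convergence factor 0 < β_td < 2, || || denotes the norm of a
vector, k is a time index, and L_0 is the filter length.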

The impulse response of the tapped delay line filter 610 at the end of the adaptation is
shown in Fig. 6c. The impulse response is measured and the position of the peak or the
maximum value of the impulse response relative to origin O gives the time delay T_d
between the two sensors, which also indicates the angle of arrival of the signal. In the case
shown, the peak lies at the centre, indicating that the signal comes from the boresight
direction (T_d = 0). The threshold θ at step 526 is selected depending upon the assumed
possible degree of departure from the boresight direction from which the target signal
might come. In this embodiment, θ is equivalent to
± 15°.
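
A minimal sketch of such a tapped delay line estimator is given below. It uses the common normalized LMS step, dividing by the squared norm of the tap vector plus a small constant, which is an assumption and may differ from the normalization intended in the text. The function name, filter length and step size are hypothetical.

```python
import numpy as np

def estimate_delay(ref, far, length=16, beta=0.25, eps=1e-8):
    """Estimate the delay of ref relative to far, in samples.

    ref is delayed by half the filter length so that a boresight signal
    produces a peak at the centre tap of the adapted filter.
    """
    half = length // 2
    w = np.zeros(length)
    for k in range(length - 1, len(ref)):
        s = far[k - length + 1:k + 1][::-1]  # tap vector, newest sample first
        d = ref[k - half]                    # delayed reference d(k)
        e = d - w @ s                        # error e(k) = d(k) - d'(k)
        mu = beta / (s @ s + eps)            # normalized step (assumed form)
        w += 2 * mu * s * e                  # LMS coefficient update
    peak = int(np.argmax(np.abs(w)))         # position of the impulse response peak
    return peak - half, w

rng = np.random.default_rng(0)
far = rng.standard_normal(4000)              # signal at the sensor reached first
ref = np.roll(far, 2)                        # reference sensor receives it 2 samples later
delay, w = estimate_delay(ref, far)
```

A returned delay of zero corresponds to a peak at the centre tap, that is, a wave front from the boresight direction; comparing the magnitude of the delay with θ then implements the angular test.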

Normalized Cross Correlation Estimation (C_x) (STEP 524)

The normalized crosscorrelation between the reference channel 10a and the most distant
channel 10d is calculated as follows:

Samples of the signals from the reference channel 10a and channel 10d are buffered into
shift registers X and Y where X is of length J samples and Y is of length K samples,
where J > K, to form two independent vectors X_r and Y_r:

$$X_r = \begin{bmatrix} x_r(1) \\ x_r(2) \\ \vdots \\ x_r(J) \end{bmatrix} \qquad \ldots \text{C.1}$$

$$Y_r = \begin{bmatrix} y_r(1) \\ y_r(2) \\ \vdots \\ y_r(K) \end{bmatrix} \qquad \ldots \text{C.2}$$

A time delay between the signals is assumed, and to capture this difference, J is made
greater than K. The difference is selected based on the angle of interest. The normalized
cross-correlation is then calculated as follows:

$$C_x(l) = \frac{Y_r^{T} X_{rl}}{\| Y_r \| \, \| X_{rl} \|} \qquad \ldots \text{C.3}$$

where

$$X_{rl} = \begin{bmatrix} x_r(l) \\ \vdots \\ x_r(K+l-1) \end{bmatrix} \qquad \ldots \text{C.4}$$

where T represents the transpose of the vector, || || represents the norm of the vector
and l is the correlation lag. l is selected to span the delay of interest. For a sampling
frequency of 16kHz and a spacing between sensors 10a, 10d of 18cm, the lag l is
selected to be five samples for an angle of interest of 15°.

The threshold T_c is determined empirically. T_c = 0.85 is used in this embodiment.
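
The normalized cross-correlation over a range of lags can be sketched as follows. The lag is counted from zero here for convenience, which is an indexing assumption, and `normalized_xcorr` is a hypothetical name.

```python
import numpy as np

def normalized_xcorr(x_r, y_r):
    """Return C_x for every lag at which y_r fits inside x_r.

    x_r has J samples, y_r has K samples with J > K; lag 0 aligns y_r
    with the first K samples of x_r.
    """
    j, k = len(x_r), len(y_r)
    y_norm = np.linalg.norm(y_r)
    c_x = np.zeros(j - k + 1)
    for lag in range(j - k + 1):
        segment = x_r[lag:lag + k]
        denom = y_norm * np.linalg.norm(segment)
        c_x[lag] = (y_r @ segment) / denom if denom > 0 else 0.0
    return c_x

rng = np.random.default_rng(1)
x_r = rng.standard_normal(40)  # J = 40
y_r = x_r[3:35].copy()         # K = 32, aligned with lag 3
c_x = normalized_xcorr(x_r, y_r)
best_lag = int(np.argmax(c_x))
```

The peak value would then be compared with T_c (0.85 in this embodiment) to decide whether the two channels carry the same signal.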


Signal Reverberation Estimation (C_rv) (STEP 530)

The degree of reverberation of the received signal is calculated using the time delay
estimator filter weight [W_td] used in the calculation of T_d above and the set of spatial filter
weights [W_su] from filter 44 (described below) as shown in the following equation:

C_rv = . . .   ...D.1

Where T represents the transpose of the vector and M is the channel associated with the
filter coefficient W_su. In this embodiment, three values for C_rv, one for each filter
coefficient W_su, are calculated. The largest is taken for subsequent processing.

The threshold T_rv used in step 506 is selected to ensure that the signal is selected as a
target signal only when the level of reverberation is moderate, as illustrated in Fig. 7.

Adaptive Spatial Filter 44 (STEPS 534, 536)


FIG.8 shows a block diagram of the Adaptive Linear Spatial Filter 44. The function of
the filter is to separate the coupled target, interference and noise signals into two types.
The first, in a single output channel termed the Sum Channel, is an enhanced target
signal having weakened interference and noise, i.e. signals not from the target signal
direction. The second, in the remaining channels termed Difference Channels, which in
the four channel case comprise three separate outputs, aims to comprise interference and
noise signals alone.



The objective is to adapt the filter coefficients of filter 44 in such a way as to
enhance the target signal and output it in the Sum Channel and at the same time
eliminate the target signal from the coupled signals and output them into the Difference
Channels.

The adaptive filter elements in filter 44 act as linear spatial prediction filters that predict
the signal in the reference channel whenever the target signal is present. The filter stops
adapting when the signal is deemed to be absent.

The filter coefficients are updated whenever the conditions of steps 504 and 506 are
met, namely:

(i) The adaptive threshold detector detects the presence of signal; 

(ii) The time delay estimator indicates that the signal arrived from the predetermined 
angle; 

(iii) The normalized cross correlation of the signal exceeds the threshold; and 
(iv) The reverberation level is low.

As illustrated in FIG.8, the digitized coupled signal X_0 from sensor 10a is fed through
a digital delay element 710 of delay Z^{-Lsu/2}. Digitized coupled signals X_1, X_2, X_3 from
sensors 10b,10c,10d are fed to respective filter elements 712,4,6. The outputs from
elements 710,2,4,6 are summed at summing element 718, the output from the summing
element 718 being divided by four at divider element 719 to form the Sum Channel
output signal. The output from delay element 710 is also subtracted from the outputs of
the filters 712,4,6 at respective difference elements 720,2,4, the output from each
difference element forming a respective Difference Channel output signal, which is also
fed back to the respective filter 712,4,6. The function of the delay element 710 is to
time align the signal from the reference channel 10a with the output from the filters
712,4,6.



The filter elements 712,4,6 adapt in parallel using the LMS algorithm given by
Equations E.1 to E.8 below, the output of the Sum Channel being given by equation E.1
and the output from each Difference Channel being given by equation E.6:

$$S_c(k) = \frac{S(k) + \bar{X}_0(k)}{4} \tag{E.1}$$

Where:

$$S(k) = \sum_{m=1}^{M-1} S_m(k) \tag{E.2}$$

$$S_m(k) = \left(W_{su}^m(k)\right)^T X_m(k) \tag{E.3}$$

Where m = 0,1,2...M-1 is the channel index, M being the number of channels (in this case
m = 0..3), T denotes the transpose of a vector and \bar{X}_0(k) is the reference channel
signal at the output of delay element 710.

X_m(k) and W_su^m(k) (equations E.4 and E.5) are column vectors of dimension L_su×1.

The weight W_su^m(k) is updated using the LMS algorithm as follows:

$$d_{cm}(k) = \bar{X}_0(k) - S_m(k) \tag{E.6}$$

$$W_{su}^m(k+1) = W_{su}^m(k) + 2\mu_{su}^m X_m(k)\, d_{cm}(k) \tag{E.7}$$

and where β_su is a user selected convergence factor, 0 < β_su < 2, || || denotes the norm of
a vector and k is a time index.
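The Sum and Difference Channel structure of E.1 to E.7 can be illustrated with the following minimal Python sketch. It is not the implementation of the embodiment: the function and argument names are hypothetical, and the step size is assumed to be the convergence factor divided by the tap power (a normalized LMS step), which the text above does not state explicitly.

```python
def spatial_filter_step(x0_delayed, taps, weights, beta=0.25, adapt=True, eps=1e-12):
    """Process one sample through the Sum/Difference structure.

    x0_delayed : delayed reference-channel sample (output of delay element 710)
    taps       : list of tap vectors X_m(k), one per filtered channel m = 1..M-1
    weights    : list of weight vectors W_su^m(k), updated in place when adapting
    Returns (sum_channel_sample, difference_channel_samples).
    """
    # Filter outputs S_m(k) = W^T X (E.3)
    s_m = [sum(w * x for w, x in zip(w_m, x_m)) for w_m, x_m in zip(weights, taps)]
    # Sum Channel: average over all M channels (E.1, E.2); M = len(taps) + 1
    s_c = (sum(s_m) + x0_delayed) / (len(taps) + 1)
    # Difference Channels (E.6)
    d_c = [x0_delayed - s for s in s_m]
    if adapt:
        for w_m, x_m, d in zip(weights, taps, d_c):
            # ASSUMPTION: step size normalized by the tap power
            mu = beta / (sum(x * x for x in x_m) + eps)
            for i, x in enumerate(x_m):
                w_m[i] += 2.0 * mu * x * d          # weight update (E.7)
    return s_c, d_c
```

The `adapt` flag stands in for the four update conditions listed above: the caller would set it only when all of them hold, so that the weights are frozen whenever the target signal is deemed absent.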

Calculation of Energy Ratio R_sd (STEP 540)
This is performed as follows: 



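$$S_c = \begin{bmatrix} s_c(0) & s_c(1) & \cdots & s_c(J-1) \end{bmatrix}^T \tag{F.1}$$

$$D_c = \begin{bmatrix} d_c(0) \\ d_c(1) \\ \vdots \\ d_c(J-1) \end{bmatrix} = \begin{bmatrix} d_{c1}(0) \\ d_{c1}(1) \\ \vdots \\ d_{c1}(J-1) \end{bmatrix} + \begin{bmatrix} d_{c2}(0) \\ d_{c2}(1) \\ \vdots \\ d_{c2}(J-1) \end{bmatrix} + \begin{bmatrix} d_{c3}(0) \\ d_{c3}(1) \\ \vdots \\ d_{c3}(J-1) \end{bmatrix} \tag{F.2}$$

J=N/2, the number of samples, in this embodiment 256.

$$E_{SUM} = \frac{1}{J-2} \sum_{j=1}^{J-2} \left[ s_c(j)^2 - s_c(j+1)\, s_c(j-1) \right] \tag{F.3}$$

$$E_{DIF} = \frac{1}{J-2} \sum_{j=1}^{J-2} \left[ d_c(j)^2 - d_c(j+1)\, d_c(j-1) \right] \tag{F.4}$$

Where E_SUM is the sum channel energy and E_DIF is the difference channel energy.

$$R_{sd} = \frac{E_{SUM}}{E_{DIF}} \tag{F.5}$$

The energy ratio between the Sum Channel and Difference Channel (R_sd) must not
exceed a predetermined threshold. In the four channel case illustrated here the threshold
is determined to be about 1.5.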
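As an illustration of the energy ratio, the sketch below computes R_sd for one block of samples. It assumes that the block energy is a Teager-style measure, s(j)² − s(j+1)s(j−1), averaged over the interior samples of the block; this form, like the function names, is an assumption of the example, and a plain sum of squares could be substituted in `block_energy` without changing the rest.

```python
def block_energy(v):
    """Teager-style energy averaged over the interior samples of block v (assumed form)."""
    j_len = len(v)
    if j_len < 3:
        raise ValueError("need at least three samples")
    total = sum(v[j] * v[j] - v[j + 1] * v[j - 1] for j in range(1, j_len - 1))
    return total / (j_len - 2)

def energy_ratio(sum_block, diff_blocks):
    """R_sd = E_SUM / E_DIF; the difference block is the sample-wise sum of the channels."""
    d_c = [sum(samples) for samples in zip(*diff_blocks)]   # combine Difference Channels (F.2)
    e_dif = block_energy(d_c)
    return block_energy(sum_block) / e_dif if e_dif != 0.0 else float("inf")
```

The returned ratio would then be compared with the predetermined threshold (about 1.5 in the four channel case).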

Adaptive Interference and Noise Estimation Filter 46 (STEPS 544, 550)

FIG.9 shows a schematic block diagram of the Adaptive Interference and Noise
Estimation Filter 46. This filter estimates the noise and interference signals and
subtracts them from the Sum Channel so as to derive an output with reduced noise and
interference.

The filter 46 takes outputs from the Sum and Difference Channels of the filter 44 and
feeds the Difference Channel signals in parallel to another set of adaptive filter elements
750,2,4 and feeds the Sum Channel signal to a corresponding delay element 756. The
outputs from the three filter elements 750,2,4 are subtracted from the output from delay
element 756 at difference element 758 to form an error output e_c, which is also fed back
to the filter elements 750,2,4. The output from delay element 756 is also passed
directly as an output, as are the outputs from the three filter elements 750,2,4.

Again, the Least Mean Square (LMS) algorithm is used to adapt the filter coefficients
W_uq as follows:



$$e_c(k) = S_c(k) - \sum_{m=1}^{M-1} S_{cm}(k) \tag{G.1}$$

Where:

$$S_{cm}(k) = \left(W_{uq}^m(k)\right)^T Y_m(k) \tag{G.2}$$

Y_m(k) = . . .   ...G.3

. . .   ...G.5

and where β_uq is a user selected convergence factor, 0 < β_uq < 2, and where m = 0,1,2...M-1
is the channel index, M being the number of channels (in this case m = 0..3).
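The structure of filter 46 can be sketched as follows. The example is a hedged illustration only: the names are hypothetical, the weight update is assumed to follow the same LMS form as E.7, and a single step size normalized by the total power of all tap vectors is assumed, because all filter elements share the one error signal e_c.

```python
def interference_canceller_step(s_c_delayed, taps, weights, beta=0.25, adapt=True, eps=1e-12):
    """One sample of the interference and noise estimation stage.

    s_c_delayed : delayed Sum Channel sample (output of delay element 756)
    taps        : tap vectors Y_m(k) built from the Difference Channel signals
    weights     : weight vectors W_uq^m(k), updated in place when adapting
    Returns (e_c, estimates), estimates being the per-channel filter outputs.
    """
    # Per-channel interference estimates (G.2)
    est = [sum(w * y for w, y in zip(w_m, y_m)) for w_m, y_m in zip(weights, taps)]
    # Error output: Sum Channel minus all estimates (G.1)
    e_c = s_c_delayed - sum(est)
    if adapt:
        # ASSUMPTION: one step size, normalized by the power of all taps together
        power = sum(y * y for y_m in taps for y in y_m) + eps
        mu = beta / power
        for w_m, y_m in zip(weights, taps):
            for i, y in enumerate(y_m):
                w_m[i] += 2.0 * mu * y * e_c        # ASSUMED LMS update, by analogy with E.7
    return e_c, est
```

With this normalization the combined update behaves like a normalized LMS filter on the stacked tap vector, which keeps the sketch stable for 0 < beta < 1.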

Calculation of Norm of filter coefficients (STEP 546)

The norms of the coefficients of filters 750,2,4 are also constrained to be smaller than
a predetermined value. The rationale for imposing this constraint is that the norm
of the filter coefficients will be large if a target signal leaks into the Difference Channel.
Scaling down the norm value of the filter coefficients will reduce the effect of signal
cancellation.

This is calculated as follows:

If:

$$\|W_{uq}^m\| > T_{no} \tag{G.6}$$

Then:

$$W_{uq}^m = \frac{W_{uq}^m}{\|W_{uq}^m\|}\, C_{no} \tag{G.7}$$

Where m is 1,2...M-1, the channels having W_uq filters. T_no is a predetermined threshold
and C_no is a scaling factor, both of which can be estimated empirically.
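The norm constraint can be illustrated with the short sketch below. It assumes that the rescaling of G.7 normalizes the weight vector and multiplies it by the scaling factor, so that the new norm equals the scaling factor; the function name and this reading of G.7 are assumptions of the example.

```python
import math

def constrain_norm(w, t_no, c_no):
    """Rescale weight vector w when its norm exceeds t_no (G.6); otherwise return a copy."""
    norm = math.sqrt(sum(x * x for x in w))
    if norm > t_no:
        # ASSUMED reading of G.7: after rescaling, the norm of the weights equals c_no
        return [x / norm * c_no for x in w]
    return list(w)
```

In use, the function would be applied to each of the W_uq weight vectors after every adaptation step, with `t_no` and `c_no` chosen empirically as stated above.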

The output e_c from equation G.1 is almost interference and noise free in an ideal
situation. However, in a realistic situation, this cannot be achieved. Either signal
cancellation will occur, degrading the target signal quality, or noise or interference will
feed through, degrading the output signal to noise and
interference ratio. The signal cancellation problem is reduced in the described
embodiment by use of the Adaptive Spatial Filter 44 which reduces the target signal
leakage into the Difference Channel. However, in cases where the signal to noise and
interference ratio is very high, some target signal may still leak into these channels.

To further reduce the target signal cancellation problem and unwanted signal feed
through to the output, the output signals from processor 46 are fed into the Adaptive
NonLinear Interference and Noise Suppression Processor 48 as described below.

Adaptive NonLinear Interference and Noise Suppression Processor 48 (STEPS 562-584)

This processor processes input signals in the frequency domain coupled with the
well-known overlap add block processing technique.

STEP 562: The output signal (e_c) and the Sum Channel output signal (S_c) are combined
as a weighted average as follows:

$$S(t) = W_1 S_c(t) + W_2 e_c(t) \tag{H.1}$$

The weights (W_1, W_2) can be empirically chosen to minimize signal cancellation or
improve unwanted signal suppression. In this embodiment, W_1 = W_2 = 0.5.

This combined signal is buffered into a memory as illustrated in FIG. 10. The buffer
consists of N/2 new samples and N/2 old samples from the previous block.
Similarly, the unwanted signals from the Difference Channel are summed in accordance
with the following and buffered in the same way as the Sum Channel:



$$I(t) = \sum_{i=1}^{M-1} D_i(t) \tag{H.2A}$$

Where:

$$D_i(t) = d_{ci}(t) \tag{H.2B}$$

Where i=1,2...M-1 and M is the number of channels, in this case M=4.

A Hanning Window is then applied to the N samples of each buffered signal as illustrated in
FIG. 11, expressed mathematically as follows:

$$S_h = \begin{bmatrix} S(t+1) & S(t+2) & \cdots & S(t+N) \end{bmatrix}^T \cdot H_n \tag{H.3}$$

$$I_h = \begin{bmatrix} I(t+1) & I(t+2) & \cdots & I(t+N) \end{bmatrix}^T \cdot H_n \tag{H.4}$$

Where (H_n) is a Hanning Window of dimension N, N being the dimension of the buffer.
The "dot" denotes point by point multiplication of the vectors, and t is a time index.
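The combining, buffering and windowing of Step 562 can be sketched as below. The text only says that a Hanning Window is used; the periodic form chosen here, whose two halves sum to one at an overlap of N/2 samples, is an assumption of the example, as are all function names.

```python
import math

def hann_window(n):
    """Periodic Hann (Hanning) window of length n (ASSUMED variant of H_n)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def combine(s_c, e_c, w1=0.5, w2=0.5):
    """Weighted average of the Sum Channel and canceller outputs (H.1)."""
    return [w1 * a + w2 * b for a, b in zip(s_c, e_c)]

def windowed_block(previous_half, new_half, window):
    """Buffer N/2 old and N/2 new samples, then apply the window point by point (H.3, H.4)."""
    block = list(previous_half) + list(new_half)
    if len(block) != len(window):
        raise ValueError("block and window lengths differ")
    return [b * h for b, h in zip(block, window)]
```

The same `windowed_block` call would be used for the summed Difference Channel signal, so that both signals are framed identically before the transform of the next step.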

Step 564: The resultant vectors [S_h] and [I_h] are transformed into the frequency
domain using the Fast Fourier Transform algorithm as illustrated in equations H.5 and H.6
below:

$$S_f = \mathrm{FFT}(S_h) \tag{H.5}$$

$$I_f = \mathrm{FFT}(I_h) \tag{H.6}$$

Step 566: A modified spectrum is then calculated, which is illustrated in Equations
H.7 and H.8:

$$P_s = |\mathrm{Real}(S_f)| + |\mathrm{Imag}(S_f)| + F(S_f)\, r_s \tag{H.7}$$

$$P_i = |\mathrm{Real}(I_f)| + |\mathrm{Imag}(I_f)| + F(I_f)\, r_i \tag{H.8}$$

Where "Real" and "Imag" refer to the real and imaginary parts, of which the absolute values
are taken, r_s and r_i are scalars and F(S_f) and F(I_f) denote a function of S_f and I_f respectively.

One preferred function F using a power function is shown below in equations H.9 and
H.10 where "conj" denotes the complex conjugate:

$$P_s = |\mathrm{Real}(S_f)| + |\mathrm{Imag}(S_f)| + \left(S_f \, \mathrm{conj}(S_f)\right) r_s \tag{H.9}$$

$$P_i = |\mathrm{Real}(I_f)| + |\mathrm{Imag}(I_f)| + \left(I_f \, \mathrm{conj}(I_f)\right) r_i \tag{H.10}$$

A second preferred function F using a multiplication function is shown below in
equations H.11 and H.12:

$$P_s = |\mathrm{Real}(S_f)| + |\mathrm{Imag}(S_f)| + |\mathrm{Real}(S_f)|\,|\mathrm{Imag}(S_f)|\, r_s \tag{H.11}$$

$$P_i = |\mathrm{Real}(I_f)| + |\mathrm{Imag}(I_f)| + |\mathrm{Real}(I_f)|\,|\mathrm{Imag}(I_f)|\, r_i \tag{H.12}$$

The values of the scalars (r_s and r_i) control the tradeoff between unwanted signal
suppression and signal distortion and may be determined empirically. (r_s and r_i) are
calculated as 1/(2^vs) and 1/(2^vi) where vs and vi are scalars. In this embodiment, vs=vi
is chosen as 8 giving r_s = r_i = 1/256. As vs,vi reduce, the amount of suppression will
increase.
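The two preferred forms of the modified spectrum can be written compactly as in the sketch below, which operates on a list of complex frequency bins. The function name and the `variant` argument are hypothetical and serve only to select between the power form (H.9, H.10) and the multiplication form (H.11, H.12).

```python
def modified_spectrum(spectrum, v=8, variant="power"):
    """Return |Re| + |Im| + F(X) * r for each complex bin X, with r = 1 / 2**v.

    variant="power"   uses F(X) = X * conj(X)        (H.9, H.10)
    variant="product" uses F(X) = |Re(X)| * |Im(X)|  (H.11, H.12)
    """
    r = 1.0 / (2 ** v)
    out = []
    for x in spectrum:
        re, im = abs(x.real), abs(x.imag)
        if variant == "power":
            f = re * re + im * im      # X * conj(X) is real and equals |X|^2
        elif variant == "product":
            f = re * im
        else:
            raise ValueError("unknown variant")
        out.append(re + im + f * r)
    return out
```

Because the sum of absolute real and imaginary parts needs no square root, the dominant term is cheap to evaluate, while the scaled second-order term is weighted by the choice of `v`.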

Step 568: The spectra (P_s) and (P_i) are warped into (Nb) critical bands using the
Bark Frequency Scale [see Lawrence Rabiner and Biing-Hwang Juang, Fundamentals of
Speech Recognition, Prentice Hall 1993]. The number of Bark critical bands depends on
the sampling frequency used. For a sampling frequency of 16 kHz, there will be Nb = 25 critical
bands. The warped Bark Spectra of (P_s) and (P_i) are denoted as (B_s) and (B_i).

Step 570: A Bark Spectrum of the system noise and environment noise is similarly
computed and is denoted as (B_n). B_n is first established during system initialization as
B_n = B_s and continues to be updated when no target signal is detected (step 508) by the
system, i.e. during any silence period. B_n is updated as follows:



$$B_n = \begin{cases} \alpha B_n + (1-\alpha) B_s & \text{if } E_r < T_{n1} \\ B_n & \text{otherwise} \end{cases} \tag{H.13}$$

Where 0<α<1; in this embodiment, α=0.9.

Steps 572,574: Using (B_s, B_i and B_n) a nonlinear technique is used to estimate a
gain (G_b) as follows:

First the unwanted signal Bark Spectrum is combined with the system noise Bark
Spectrum using an appropriate weighting function as illustrated in Equation J.1.

$$B_y = Q_1 B_i + Q_2 B_n \tag{J.1}$$

Q_1 and Q_2 are weights which can be chosen empirically so as to maximize unwanted
signal and noise suppression with minimal signal distortion.

Following that, a post signal to noise ratio is calculated using Equations J.2 and J.3 below:

$$R_{po} = \frac{B_s}{B_y} \tag{J.2}$$

$$R_{pp} = R_{po} - I_c \tag{J.3}$$

The division in equation J.2 means element by element division and not vector division.
R_po and R_pp are column vectors of dimension Nb×1, Nb being the dimension of the Bark
Scale Critical Frequency Band, and I_c is a column unity vector of dimension Nb×1 as
shown below:

$$R_{po} = \begin{bmatrix} r_{po}(1) & r_{po}(2) & \cdots & r_{po}(Nb) \end{bmatrix}^T \tag{J.4}$$

$$R_{pp} = \begin{bmatrix} r_{pp}(1) & r_{pp}(2) & \cdots & r_{pp}(Nb) \end{bmatrix}^T \tag{J.5}$$

$$I_c = \begin{bmatrix} 1 & 1 & \cdots & 1 \end{bmatrix}^T \tag{J.6}$$

If any of the r_pp(nb) elements of R_pp are less than zero, they are set equal to zero.
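The noise spectrum update of H.13 and the post signal to noise ratio of J.1 to J.3 can be sketched as below. The function names are hypothetical, the silence test is abstracted into a boolean flag supplied by the caller, and the default weights of 1.0 are placeholders only, since the weights are to be chosen empirically.

```python
def update_noise_spectrum(b_n, b_s, target_absent, alpha=0.9):
    """Recursive average of H.13; the estimate is left unchanged while a target is present."""
    if not target_absent:
        return list(b_n)
    return [alpha * n + (1.0 - alpha) * s for n, s in zip(b_n, b_s)]

def posterior_snr(b_s, b_i, b_n, q1=1.0, q2=1.0):
    """Return (R_po, R_pp) per J.1 to J.3, with negative R_pp elements set to zero.

    q1 and q2 are placeholder weights (assumption); all bands of B_y must be non-zero.
    """
    b_y = [q1 * i + q2 * n for i, n in zip(b_i, b_n)]     # combined unwanted spectrum (J.1)
    r_po = [s / y for s, y in zip(b_s, b_y)]               # element by element division (J.2)
    r_pp = [max(r - 1.0, 0.0) for r in r_po]               # subtract unity vector and clamp (J.3)
    return r_po, r_pp
```

Both functions work band by band on lists of length Nb, matching the element by element operations described in the text.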

Using the Decision Directed Approach [see Y. Ephraim and D. Malah: Speech
Enhancement Using Optimal NonLinear Spectrum Amplitude Estimation; Proc. IEEE
International Conference Acoustics Speech and Signal Processing (Boston) 1983,
pp. 1118-1121], the a-priori signal to noise ratio R_pr is calculated as follows:

$$R_{pr} = (1-\beta_i)\, R_{pp} + \beta_i\, \frac{B_o}{B_y} \tag{J.7}$$

The division in Equation J.7 means element by element division. B_o is a column
vector of dimension Nb×1 and denotes the output signal Bark Spectrum
from the previous block, B_o = G_b · B_s (see Eqn J.15) (B_o initially is zero). R_pr is also a
column vector of dimension Nb×1. The value of β_i is given in Table 1 below:

i    β
1    0.01625
2    0.01225
3    0.245
4    0.49
5    0.98

TABLE 1



The value i is set equal to 1 on the onset of a signal and the β value is therefore equal
to 0.01625. Then the i value will count from 1 to 5 on each new block of N/2 samples
processed and stay at 5 until the signal is off. The i will start from 1 again at the next
signal onset and the β is taken accordingly.

Instead of β being constant, in this embodiment β is made variable and starts at a small
value at the onset of the signal to prevent suppression of the target signal and increases,
preferably exponentially, to smooth R_pr.
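The block-dependent smoothing factor and the a-priori signal to noise ratio can be illustrated as follows. The sketch assumes the usual decision-directed combination, in which the factor weights the previous output spectrum divided by the unwanted spectrum, and its complement weights R_pp; the names are hypothetical and the schedule simply copies the values of Table 1.

```python
BETA_SCHEDULE = [0.01625, 0.01225, 0.245, 0.49, 0.98]   # Table 1, i = 1..5

def beta_for_block(i):
    """Return the factor for the i-th block since signal onset (i >= 1); i saturates at 5."""
    if i < 1:
        raise ValueError("i counts from 1 at signal onset")
    return BETA_SCHEDULE[min(i, len(BETA_SCHEDULE)) - 1]

def a_priori_snr(r_pp, b_o, b_y, i):
    """R_pr = (1 - beta_i) * R_pp + beta_i * (B_o / B_y), band by band (ASSUMED form of J.7)."""
    beta = beta_for_block(i)
    return [(1.0 - beta) * pp + beta * (o / y) for pp, o, y in zip(r_pp, b_o, b_y)]
```

The caller would reset `i` to 1 at each signal onset and increment it once per processed block of N/2 samples, as described above.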

From this, R_rr is calculated as follows:

. . .   ...J.8

The division in Equation J.8 is again element by element. R_rr is a column vector of
dimension Nb×1.



From this, L_x is calculated:

$$L_x = R_{rr} \cdot R_{po} \tag{J.9}$$

The value of L_x is limited to π (≈3.14). The multiplication in Equation J.9 means
element by element multiplication. L_x is a column vector of dimension Nb×1 as shown
below:

$$L_x = \begin{bmatrix} l_x(1) & l_x(2) & \cdots & l_x(nb) & \cdots & l_x(Nb) \end{bmatrix}^T \tag{J.10}$$



A vector L_y of dimension Nb×1 is then defined as:

$$L_y = \begin{bmatrix} l_y(1) & l_y(2) & \cdots & l_y(nb) & \cdots & l_y(Nb) \end{bmatrix}^T \tag{J.11}$$

Where nb = 1,2...Nb. Then L_y is given as:

. . .   ...J.12

and

$$E(nb) = -0.57722 - \log\left(l_x(nb)\right) + l_x(nb) - \frac{l_x(nb)^2}{4} + \frac{l_x(nb)^3}{18} - \frac{l_x(nb)^4}{96} + \ldots \tag{J.13}$$

E(nb) is truncated to the desired accuracy. L_y can be obtained using a table look-up
approach to reduce computational load.
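The series of J.13 is the power series of the exponential integral, and the sketch below evaluates it to a chosen number of terms. How R_rr and L_y are then formed is not reproduced here from the specification: the code assumes the commonly used log-spectral amplitude form, R_rr = R_pr/(1 + R_pr) and l_y = exp(E/2), purely for illustration, and all names and the small lower bound on l_x are assumptions.

```python
import math

EULER_GAMMA = 0.57722   # constant as given in J.13

def expint_series(x, terms=20):
    """Series of J.13 for x > 0: -gamma - log(x) + x - x^2/4 + x^3/18 - x^4/96 + ..."""
    total = -EULER_GAMMA - math.log(x)
    power, factorial = 1.0, 1.0
    for k in range(1, terms + 1):
        power *= x
        factorial *= k
        total += (-1.0) ** (k + 1) * power / (k * factorial)
    return total

def suppression_gain(r_pr, r_po, l_x_max=math.pi, l_x_min=1e-6):
    """Per-band gain R_rr * L_y (J.14) under the ASSUMED forms of R_rr and L_y."""
    gains = []
    for pr, po in zip(r_pr, r_po):
        r_rr = pr / (1.0 + pr)                         # ASSUMED form of J.8
        l_x = min(max(r_rr * po, l_x_min), l_x_max)    # J.9, limited to pi; floor avoids log(0)
        l_y = math.exp(0.5 * expint_series(l_x))       # ASSUMED form of J.12
        gains.append(r_rr * l_y)                       # J.14
    return gains
```

Since l_x is confined to a bounded range, the values of l_y could equally be precomputed on a grid and read from a table, which is the look-up approach mentioned in the text.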

Finally, the Gain G_b is calculated as follows:

$$G_b = R_{rr} \cdot L_y \tag{J.14}$$

The "dot" again implies element by element multiplication. G_b is a column vector of
dimension Nb×1 as shown:

$$G_b = \begin{bmatrix} g(1) & g(2) & \cdots & g(nb) & \cdots & g(Nb) \end{bmatrix}^T \tag{J.15}$$

Step 578: As G_b is still in the Bark Frequency Scale, it is then unwarped back to
the normal linear frequency scale of N dimensions. The unwarped G_b is denoted as G.

The output spectrum with unwanted signal suppression is given as:

$$\bar{S}_f = G \cdot S_f \tag{J.16}$$

The "dot" again implies element by element multiplication.

Step 580: The recovered time domain signal is given by:

$$\bar{S}_t = \mathrm{Real}\left(\mathrm{IFFT}(\bar{S}_f)\right) \tag{J.17}$$

IFFT denotes an Inverse Fast Fourier Transform, with only the Real part of the inverse
transform being taken.

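Step 584: Finally, the output time domain signal is obtained by overlap add with the
previous block of output signal:

$$\hat{S}_t = \begin{bmatrix} \bar{S}_t(1) \\ \bar{S}_t(2) \\ \vdots \\ \bar{S}_t(N/2) \end{bmatrix} + \begin{bmatrix} Z_t(1) \\ Z_t(2) \\ \vdots \\ Z_t(N/2) \end{bmatrix} \tag{J.18}$$

Where:

$$Z_t = \begin{bmatrix} \bar{S}_{t-1}(1+N/2) & \bar{S}_{t-1}(2+N/2) & \cdots & \bar{S}_{t-1}(N) \end{bmatrix}^T \tag{J.19}$$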
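The overlap add of Step 584 amounts to adding the first half of the current recovered block to the second half of the previous one. A minimal sketch, with a hypothetical function name, is:

```python
def overlap_add(current_block, previous_block):
    """Return N/2 output samples: the first half of the current block plus the
    second half of the previous block (J.18, J.19)."""
    n = len(current_block)
    if n % 2 or len(previous_block) != n:
        raise ValueError("blocks must share an even length N")
    half = n // 2
    z_t = previous_block[half:]                       # tail of the previous block (J.19)
    return [s + z for s, z in zip(current_block[:half], z_t)]
```

When the analysis window sums to one at an overlap of N/2 samples and the gain is unity, this step returns the original samples, which is a convenient check on an implementation of the block processing chain.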



The embodiment described is not to be construed as limitative. For example, there can
be any number of channels from two upwards. Furthermore, as will be apparent to one
skilled in the art, many steps of the method employed are essentially discrete and may
be employed independently of the other steps or in combination with some but not all
of the other steps. For example, the adaptive filtering and the frequency domain
processing may be performed independently of each other and the frequency domain
processing steps such as the use of the modified spectrum, warping into the Bark scale
and use of the scaling factor β can be viewed as a series of independent tools which
need not all be used together.



Use of first, second etc. in the claims should only be construed as a means of
identification of the integers of the claims, not of process step order. Any novel feature
or combination of features disclosed is to be taken as forming an independent invention
whether or not specifically claimed in the appendant claims of this application as
initially filed.